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Abstract 

Characterizing timc-cvolution of allele frequencies in a population is a fundamental problem 
in population genetics. In the Wright-Fisher diffusion, such dynamics is captured by the transi- 
tion density function, which satisfies well-known partial differential equations. For a multi-allelic 
model with general diploid selection, various theoretical results exist on representations of the 
transition density, but finding an explicit formula has remained a difficult problem. In this pa- 
per, a technique recently developed for a diallelic model is extended to find an explicit transition 
density for an arbitrary number of alleles, under a general diploid selection model with recurrent 
parent-independent mutation. Specifically, the method finds the eigenvalues and eigenfunctions 
of the generator associated with the multi-allelic diffusion, thus yielding an accurate spectral 
representation of the transition density. Furthermore, this approach allows for efficient, accurate 
computation of various other quantities of interest, including the normalizing constant of the 
stationary distribution and the rate of convergence to this distribution. 

1 Introduction 

Diffusion processes can be used to describe the evolution of population-wide allele frequencies in 
large populations, and they have been successfully applied in various population genetic analyses in 
the past. Karlin and Taylor (1981), Ewens (2004), and Durrett (2008) provide excellent introduction 
to the subject. The diffusion approximation captures the key features of the underlying evolutionary 
model and provides a concise framework for describing the dynamics of allele frequencies, even in 
complex evolutionary scenarios. However, finding explicit expressions for the transition density 
function (TDF) is a challenging problem for most models of interest. Although a partial differential 
equation (PDE) satisfied by the TDF can be readily obtained from the standard diffusion theory, 
few models admit analytic solutions. 

Since closed- form transition density functions are unknown for general diffusions, approaches 
such as finite difference methods (Bollback et al., 2008; Gutenkunst et al., 2009) and series expan- 
sions (Lukic et al., 2011) have been adopted recently to obtain approximate solutions. In a finite 
difference scheme, one needs to discretize the state space, but since the TDF depends on the pa- 
rameters of the model (e.g. the selection coefficients), the suitability of a given discretization might 
depend strongly on the parameter values, and whether a particular discretization would produce 
accurate solutions is difficult to predict a priori. Series expansions allow one to circumvent the 
problem of choosing an appropriate discretization for the state space. However, if the chosen basis 
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functions in the representation are not the eigenfunctions of the diffusion generator, as in Lukic 
et al. (2011), then one has to solve a system of coupled ordinary differential equations (ODE) to 
obtain the transition density. Lukic et al. (2011) solve this system of ODEs numerically, which may 
introduce potential errors of numerical approximations. 

If the eigenvalues and eigenfunctions of the diffusion generator can be found, the spectral 
representation (which in some sense provides the optimal series expansion) of the TDF can be 
obtained. For the one-locus Wright-Fisher diffusion with an arbitrary number K of alleles evolving 
under a neutral parent-independent mutation (PIM) model, Shimakura (1977) and Griffiths (1979) 
derived an explicit spectral representation of the TDF using orthogonal polynomials. More recently, 
Baxter et al. (2007) derived the same solution by diagonalizing the associated PDE using a suitable 
coordinate transformation, followed by solving for each dimension independently. In a related line 
of research, Griffiths and Li (1983) and Tavare (1984) expressed the time-evolution of the allele 
frequencies in terms of a stochastic process dual to the diffusion, and showed that the resulting 
expression is closely related to the spectral representation. 

The duality approach was later extended by Barbour et al. (2000) to incorporate a general 
selection model. Although theoretically very interesting, this approach does not readily lead to 
efficient computation of the TDF because of the following reason: Computation under the dual 
process requires evaluating the moments of the stationary distribution. Although the functional 
form of this distribution is known (Ethier and Kurtz, 1994; Barbour et al., 2000), the normalization 
constant and moments can only be computed analytically in special cases (Genz and Joyce, 2003), 
and numerical computation under a general model of diploid selection is difficult. Incidentally, this 
issue arises in various applications (e.g., Buzbas et al. 2009, 2011), and it has therefore received 
significant attention in the past; see, for example, Donnelly et al. (2001) and Buzbas and Joyce 
(2009). 

Many decades ago, Kimura (1955, 1957) addressed the problem of finding an explicit spectral 
representation of the TDF for models with selection. Specifically, in the case of a diallelic model 
with special selection schemes, he employed a perturbation method to find the required eigenvalues 
and eigenfunctions of the diffusion generator. Being perturbative in the selection coefficient, this 
approach is accurate only for small selection parameters. Recently, Song and Steinriicken (2012) 
revisited this problem and developed an alternative method of deriving an explicit spectral represen- 
tation of the TDF for the diallelic Wright-Fisher diffusion under a general diploid selection model 
with recurrent mutation. In contrast to Kimura's approach, this new approach is non-perturbative 
and is applicable to a broad range of parameter values. The goal of the present paper is to extend 
the work of Song and Steinriicken (2012) to an arbitrary number K of alleles, assuming a PIM 
model with general diploid selection. 

The rest of this paper is organized as follows. In Section 2, we lay out the necessary mathe- 
matical background and review the work of Song and Steinriicken (2012) in the case of a diallelic 
(K = 2) model with general diploid selection. In Section 3, we describe the spectral representa- 
tion for the neutral PIM model with an arbitrary number K of alleles. Then, in Section 4, we 
generalize the method of Song and Steinriicken (2012) to an arbitrary i^-allelic PIM model with 
general diploid selection. We demonstrate in Section 5 that the quantities involved in the spectral 
representation converge rapidly. Further, we discuss the computation of the normalization constant 
of the stationary distribution under mutation-selection balance and the rate of convergence to this 
distribution. We conclude in Section 6 with potential applications and extensions of our work. 
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2 Background 



2.1 The Wright-Fisher diffusion 

In this paper, we consider a single locus with K distinct possible alleles. The dynamics of the allele 
frequencies in a large population is commonly approximated by the Wright-Fisher diffusion on the 
{K — l)-simplex 

A^_! := {x € M^" 1 : 1 - |x| > 0} , 

where |x| = Yld=i x i- For a given x = (xi, . . . , xk-i) £ Ajf_i, the component Xi denotes the 
population frequency of allele i £ {1, . . . , K — 1}. The frequency of allele K is given by xk = 1 — |x|. 
The associated diffusion generator Jz? is a second order differential operator of form 

= ^ E m^^t/m + E ^w^W' w 

ij'=l * ^ i=l 

which acts on twice continuously differentiable functions / : Ajf-i — >• R. The diffusion coefficient 
6jj (x) is given by 

where the Kronecker delta Si j is equal to 1 if i = j and otherwise. For a neutral PIM model, we 
use &i = ANui to denote the population-scaled mutation rate associated with allele i, where Ui is 
the probability of mutation producing allele i per individual per generation and N is the effective 
population size. Under this model, the drift coefficient a,(x) is given by 

Oi(x) = -(9i - \6\xi), 

where 6 = (9 1 ,...,e K ) 6 and |0| = Y*=\ e i- 

Consider a general diploid selection model in which the relative fitness of a diploid individual 
with one copy of allele i and one copy of allele j is given by 1 + 2si j. We measure fitness relative 
to that of an individual with two copies of allele K, thus sk,k = 0. The diffusion generator in this 
case is given by ££ = JSfb + j where Jz?o denotes the diffusion generator under neutrality and the 
additional term J2? CT , which captures the contribution from selection to the drift coefficient cij(x), is 
given by 

K-l Q 

&a = E Xl ~ ax 7 ' 

i=l 1 

where cjj(x) denotes the marginal fitness of type i and cr(x) denotes the mean fitness of the popu- 
lation with allele frequencies x. More precisely, 

K 

Oj(x) = y^ j (Tj,jX j (2) 

and 

K 

a(x) := (JijXiXj, (3) 
»j'=l 
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where a^j = 2Nsij. Intuitively, the population frequency of a given allele tends to increase if its 
marginal fitness is higher than the mean fitness of the population. The selection scheme is specified 
by a symmetric matrix cr = {(Jij)i<ij<K £ M. KxK of population-scaled selection coefficients, with 
&k,k = 0. 

The operators Jz?o and Jz? are elliptic inside the simplex Ak—i, but not on the boundaries. Thus, 
the precise domain of the generator is not straightforward to describe, but Epstein and Mazzeo 
(2011) give a suitable characterization. 

2.2 Spectral representation of the transition density function 

For t > 0, the time evolution of a diffusion Xj on the simplex Ax-i is described by the transition 
density function p(t;x,y)dy = P[X^ € dy | Xo = x], where x, y £ Ax-i- The transition density 
function satisfies the Kolmogorov backward equation 

J^(t;x,y) =J&?p(t;x,y), (4) 

where Jz? , the generator associated with the diffusion, is a differential operator in x. 

We briefly review the framework underlying the spectral representation of the transition density 
function. The operator Jz? is said to be symmetric with respect to a density ir: A^-i —> K>o if, 
for all twice continuously differentiable functions /: Ak-i — > K and g: Ak-i — > K that belong to 
the domain of the operator, the following equality holds: 

/ [JSf/(x)]0(x)7r(x)dx = / /(x)[jSf0(x)]7r(x)dx. 

A straightforward calculation using integration by parts yields that the diffusion generators de- 
scribed in Section 2.1 are symmetric with respect to their associated stationary densities. 

Theorem 1.4.4 of Epstein and Mazzeo (2011) guarantees that an unbounded symmetric opera- 
tor Jz? of the kind defined in Section 2.1 of this paper has countably many eigenvalues {— Ao, — Ai, — A2, . . .}, 
which are real and non-positive, satisfying 

< A < Ai < A 2 < • • • , 

with A n — > 00 as n — > 00. An eigenfunction B n : Ak-i —> M with eigenvalue — A n satisfies 

Jz^B n (x) = -A nJ B n (x), (5) 

and, furthermore, -B n ( x ) is an element of the Hilbert space L 2 (A^-_i, 7r(x)) of functions square 
integrable with respect to the density vr(x), equipped with the canonical inner product (•, •}„-. If Jz? 
is symmetric with respect to vr(x), then its eigenfunctions are orthogonal with respect to 7r(x): 

(B n ,B m ) n := / B n (x)B m (x)ir(x)dx = 5 n , m d n , 

where 5 n ^ m is the Kronecker delta and d n are some constants. In the cases considered in this paper, 
the eigenfunctions form a basis of the Hilbert space L 2 (A^-_i, vr(x)) . 

It follows from equation (5) that exp(— A n t)B n (x.) is a solution to the Kolmogorov backward 
equation (4). By linearity of (4), the sum of two solutions is again a solution. Combining the initial 
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condition p(0; x, y) = <5(x — y) with the fact that {B n (x)} form a basis yields the following spectral 
representation of the transition density: 

oo ^ 

p(t; x, y) = ^-e~ A ' li i?n(x)S n (y)^(y). (6) 

n=0 n 

The initial condition being the Dirac delta J (x — y) corresponds to the frequency at time zero being 
x. 



2.3 Univariate Jacobi polynomials 

The univariate Jacobi polynomials play an important role throughout this paper. Here we review 
some key facts about this particular type of classical orthogonal polynomials. An excellent treatise 
on univariate orthogonal polynomials can be found in Szegd (1939) and a comprehensive collection 
of useful formulas can be found in Abramowitz and Stegun (1965, Chapter 22). 
The Jacobi polynomials pn' b \z), for z € [—1,1], satisfy the differential equation 

(1 - z 2 )^-P^- + [b- a -(a + b + 2)z]^P- + n(n + a + b + l)f(z) = 0. (7) 
az z dz 

For given a,b> —1, the set {p^ L ' b \z)}'^ =0 forms an orthogonal system on the interval [—1, 1] with 
respect to the weight function (1 — z) a (l + z) b . For a more convenient correspondence with the 
diffusion parameters, we define the following modified Jacobi polynomials, for x € [0, 1] and o, b > 0: 

R^ b) {x)=pt lA - l) {2x-l). 

This definition is slightly different from that adopted by Griffiths and Spano (2010). 

Equation (7) implies that the modified Jacobi polynomials Rn' b \x), for x S [0,1], satisfy the 
differential equation 

x(l - + [a - (a + b)x]^j^- + n(n + a + b - l)f{x) = 0. (8) 

For fixed a, b > 0, the set {R^' b \x)}^ =0 forms an orthogonal system on [0,1] with respect to the 
weight function x a ~ l (l — x) b ~ l . More precisely, 

l 

Rn ' (p^)B"m ^ {p&)X (1 x) dx — &n,m^n ' ; 







where <5 n , m denotes the Kronecker delta and 

(9) 



,(a,6) _ T(n + a)T(n + b) 



(2n + a + b-l)T(n + a + b-l)T(n + l)' 
Note that {Rn' b \x)}'^' =0 form a complete basis of the Hilbert space L 2 ([0, l],x a ~ 1 (l — x)^" 1 ). 
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For n > 1, the modified Jacobi polynomial Rn' b \x) satisfies the recurrence relation 

X RM( X )- (n + a-l)(n + b-l) R (a,b) () 
XUn [X) ~ (2n + a + b - l)(2n + a + b - 2) ^ { ] 



+ 



" 2 -^-^ «(M> W (10) 



2 2(2n + a + 6)(2n + a + 6-2 



while, for n = 0, 



(w + l)(w + a + b-l) p(a,&)/ s 

+ (2n + a + b)(2n + a + b-l) n+l[ h 

xR °' b){x) = irrb R ^ b){x) + ^Tb R ^ b){x) - 

Also, note that R^'^^x) = 1. These recurrence relations play an important role in the work of 
Song and Steinriicken (2012), and the multivariate analogues, discussed later in Section 3.2, are 
similarly important for the present work. 

The modified Jacobi polynomials satisfy other interesting relations, one of them being the 
following: 

R (g,b) (x) _ n + a + b-l ( H1) n + a-l R (a,b+i) ( ) (m 

Using this identity, polynomials with parameter b can be related to polynomials with parameter 
6+1. We utilize this relation later. 

2.4 A review of the K = 2 case 

To motivate the approach to be employed in the general case, we briefly review the work of Song 
and Steinriicken (2012) for deriving the transition density function in the diallelic {K = 2) case. 
The vector of mutation rates is given by = (a,/3), while the symmetric matrix describing the 
general diploid selection scheme can be parametrized as 

2a 2ah 
2ah 

where a is the selection strength and h the dominance parameter. For K = 2, the diffusion is 
one dimensional and the simplex Ai is equal to the unit interval [0, 1]. With x denoting x±, the 
generator (1) reduces to 

JSf/(x) = \x{l - x)^f{x) + {i[a - (a + fix] + 2ax{l - x)[x + h(l - 2x)]]^-f{x). 

In the neutral case (i.e., a = 0), the modified Jacobi polynomials R^^\x) are eigenfunctions of 
the diffusion generator with eigenvalues An ^ = \n(n — l + a + /3). Hence, a spectral representation 
of the transition density function can be readily obtained via (6). 

In the non-neutral case (i.e., a ^ 0), consider the functions S^(x) = e~ & ( x ^ 2 Rh'^\x) , which 
form an orthogonal basis of the Hilbert space L 2 ([0, l],e°~( x ^x a ~ 1 (l — x)^" 1 ), where e a ^x a ~ 1 (l — 
rr)' 3-1 corresponds to the stationary distribution of the non-neutral diffusion, up to a multiplicative 
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constant. Since the eigenfunctions B n (x) of the diffusion generator are elements of this Hilbert 
space, we can pose an expansion B n {x) = Y^=o w n,mS^ n {x) in terms of the basis functions S^(x), 
where w n ^ m are to be determined. Then, the eigenvalue equation 5£B n {x) = —A n B n (x) implies 
the algebraic equation 

oo oo 

wn, m [Xfc® + Q(x;a,P,a,h)\ Ii£>®(x) = A„ J2 w n , m R^\x), 

where Q(x; a, ($, a, h) is a polynomial in x of degree four. Utilizing the recurrence relations in (10) 
and (11), one can then arrive at a linear system Mw n = A n w n , where w n = (w n Q,w n i,w n 2, ■ • •) 
is an infinite-dimensional vector of variables and M is a sparse infinite-dimensional matrix with 
entries that depend on the index n and the parameters a, /3,o~,h of the model. The infinite linear 
system Mw n = A n w n is approximated by a finite-dimensional truncated linear system 

where = (w^l, , • • • , w^D-i) anc ^ is the submatrix of M consisting of its first D rows 
and D columns. This finite-dimensional linear system can be easily solved using standard linear 
algebra to obtain the eigenvalues and the eigenvectors Wn of M^- D \ Song and Steinriicken 
observed that Affl and wl^m converge very rapidly as the truncation level D increases. Finally, 
the coefficients w\^m can be used to approximate the eigenfunctions B n (x), and, together with the 
eigenvalues A\^\ an efficient approximation of the transition density function can be obtained via 
(6). 



3 The Neutral Case with an Arbitrary Number of Alleles 

In this section, we describe the spectral representation of the transition density of a neutral PIM 
model with an arbitrary number K of alleles. As in the case of K = 2, reviewed in Section 2.4, for 
an arbitrary K the eigenfunctions in the neutral case can be used to construct the eigenfunctions 
in the case with selection. The latter case is considered in Section 4. 



3.1 Multivariate Jacobi polynomials 



In what follows, let No = {0, 1, 2, . . .} denote the set of non-negative integers. As in Griffiths and 
Spano (2011), we define the following system of multivariate orthogonal polynomials in K — 1 
variables: 



Definition 1. For each vector n = (n±, . . . ,riK-i) € 
orthogonal polynomial P^(^) is defined as 

K-i r / s Ni 



iK-l 



and 



where Ni 



^(X) 



£j=i+i n i and e i 



n 



1 



i - £S 



,0k) e k£ , the 
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For x = (x\, . . . , xk-i) € Ajf_i, let IIo(x) denote an unnormalized density of the Dirichlet 
distribution with parameter 9 = (#1, . . . , 9k)'- 

k 

n (x) = n^-\ (13) 

1=1 

where xk = 1 — |x|. The following lemma, a proof of which is provided in Appendix A, states that 
the above multivariate Jacobi polynomials -Pn( x ) are orthogonal with respect to IIo(x): 

Lemma 2. For n, m E N^" -1 , 

/ P*(x)P£(x)n (x)dx = <5 n , m C*, 
where 5^ m = Wfj^ S niimi and 

C e n :=Uc^ +m) , (14) 

i=i 

with ci defined in (9) . 

Remark: The multivariate Jacobi polynomials form a complete basis of L (Ak-i, n (x)), the 
Hilbert space of functions on A^-i square integrable with respect to the unnormalized Dirichlet 
density IIo(x). 

3.2 Recurrence relation for multivariate Jacobi polynomials 

Recall that the univariate Jacobi polynomials satisfy the recurrence relation (10). Theorem 3.2.1 
of Dunkl and Xu (2001) guarantees that the multivariate Jacobi polynomials satisfy a similar 
recurrence relation. More precisely, we have the following lemma, the proof of which is provided in 
Appendix A: 

Lemma 3. Given n = (m, . . . , uk-i) £ ^o~ l an d m = ( m ii • • • ,mK—i) £ ^o~ l > define Nj = 
Yli=j+i n i an d Mj = Yli=l+i m i- F° r given i € {1, . . . , K — 1} and n, -P„ (x) satisfies the recurrence 
relation 

z,i*(x) = ^ rggj&x), (15) 

mGA4i(n) 

where rn,m are known constants (provided in Appendix A ) and 

Mi{n) := jm £ N^" -1 : Mj = Nj for all j > i and \Mj - Nj\ < 1 for all j <i\. (16) 

Impose an ordering on the (K — l)-dimensional index vectors n € N^" 1 . Then, the recur- 
rence (15) can be represented as 

Xii£(x)= J2 [^]n,m P mW. 
meMi(n) 



s 



where Qf corresponds to an infinite dimensional matrix in which columns and rows are indexed by 
the ordered (K — l)-tuples, and the (n, m)-th entry is defined as 

[Qi Jn,m = < (17) 

I U, otherwise. 

Note that for each given n, the number of non-zero entries in every row of Qf is finite. One can 
deduce the following corollary from the new representation: 

Corollary 4. Let a = (on) ngN ^-i be such that X^ ne N K_1 a nCn < 00 • rfaen 



w/iere (&n) n6N *--i = a ■ Qf . 

Remark: Since under multiplication X\ commutes with Xj for 1 < i,j < K — 1, the corresponding 
matrices Qf and C/f also commute. 

3.3 Eigenfunctions of the neutral generator Jz? 

It is well known that the stationary distribution of the Wright-Fisher diffusion under a neutral 
PIM model is the Dirichlet distribution (Wright, 1949). The density of the Dirichlet distribution 
is a weight function with respect to which the associated diffusion generator «2o is symmetric. As 
discussed in Section 3.1, the multivariate Jacobi polynomials -P„(x) are orthogonal with respect 
to the weight function ITo(x), which is equal to the density of the Dirichlet distribution up to a 
multiplicative constant. Given the discussion in Section 2.2, one might then suspect that -P„(x) 
are potential eigenfunctions of _2?o- The following lemma establishes that this is indeed the case: 

Lemma 5. For all n € N^'" 1 , the multivariate Jacobi polynomials -P„(x) satisfy 

J^(x) = -Af n| P n e (x), 

where 



A? n| = >|(|n|-l + |0|). (18) 



1 

V| - 2' 

That is, -Pn(x) are eigenfunctions of Jz?o with eigenvalues — A^|. 

A proof of this lemma is deferred to Appendix A. We conclude this section with a few comments. 
Remarks: 

i) Substituting the eigenvalues and eigenfunctions into the spectral representation (6), we obtain 

P (t;x,y)= E 7^~ A ' n|t ^(*)^(y)no(y). (19) 

ii) For every n € N^" 1 , note that Aj^| only depends on the norm |n|, which implies degeneracy 
in the spectrum of J2?o- Griffiths (1979) constructed orthogonal kernel polynomials indexed by 
|n|, that is the sum over all orthogonal polynomials with index summing to |n|, and obtained 
the transition density expansion (19). 
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4 A General Diploid Selection Case with an Arbitrary Number of 
Alleles 



In this section, we derive the spectral representation of the transition density function of the Wright- 
Fisher diffusion under a if-allelic PIM model with general diploid selection. This work extends the 
work of Song and Steinriicken (2012), the special case of K = 2 briefly summarized in Section 2.4, 
to an arbitrary number K of alleles. The recurrence relation presented in Lemma 3 plays a crucial 
role in the following derivation. 

Recall that the backward generator for the full model is Jz? = Jz?o + -^o-, where ££q corresponds 
to the generator under neutrality and Jz? CT corresponds to the contribution from selection. The 
diffusion has a unique stationary density [see Ethier and Kurtz (1994) or Barbour et al. (2000)] 
proportional to 

n(x) := e*Wn„(x), (20) 

where IIo(x) is defined in (13) and <r(x) is the mean fitness defined in (3). As mentioned in 
Section 2.2, Jz? is symmetric with respect to n(x). For n G No, we aim to find the eigenvalues — A n 
and the eigenfunctions B n of J§? such that 

JSf5 n (x) = -A n £ n (x). (21) 

By convention, we place A n in non-decreasing order. The symmetry of Jzf implies that {B n (x.)} 
form an orthogonal system with respect to n(x), that is 

B n (x)S m (x)n(x)dx oc 6 n , m . 

Ajf-i 

Such a system of orthogonal functions, however, is not unique. The orthogonality of {-P„} with 
respect to Ilo, established in Lemma 2, can be used to show that the functions 

fl*(x) := P n (x) e -^)/ 2 (22) 

are orthogonal with respect to II, as are B n (x). Furthermore, the fact that {P®(x)} form a 
complete basis of L 2 (Ak~i, ITo(x)) means that {5„(x)} is a complete basis of L 2 (A^_i, n(x)) . 
Since B n € L 2 (Ajf-i, n(x)) , we thus seek to represent B n (x) as linear combination of the basis 
fl*(x): 

B„(x)= £ u„, m 5^(x), (23) 

where u n , jm are some constants to be determined. 

Define an index set I = u£_ {!,... ,K — 1} L and for i = ■ ■ £ I, define Xi = x- Ll ■ ■ ■ Xi L . 
We have the following theorem for solving the eigensystem associated with the full generator Jzf: 

Theorem 6. For all n 6 No, the eigenfunction B n (x) of ££ can be represented by (23). The corre- 
sponding eigenvalues —A n and the coefficients u n , m can be found by solving the infinite- dimensional 
eigensystem 

u n M = u n A n , (24) 

where u n = («n,m) meN «'-i and 

M = diag ({Af m| } meNrl ) +"£q(l)g?. 

iei 
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Here, A^ m | is defined as in (18) and, for i = . . . , i^) G I, we define Q? = • • • Qf L , where Qf 

is given in (17). When L = 0, Qf is defined to be the identity matrix. Explicit expressions of the 
constants q{\) are provided in Appendix C. 



Remarks: 

i) Although M is infinite dimensional, it is in fact sparse with only finitely many non-zero entries 
in every row and column. 

ii) Equation (24) implies that A n and u n are in fact the left eigenvalues and eigenvectors of M. To 
solve the eigensystem in practice requires some truncation of the matrix to finite dimensions, 
as in the K = 2 case described in Section 2.4. For a given n £ No, we would like both A n and 
u n ,m to converge as the truncation level increases. In Section 5, we demonstrate that this is 
indeed the case using empirical examples. 

We now provide a proof of the theorem. 

Proof of Theorem 6. Substituting (23) into (21) we obtain 

Y «n,k^Sg(x) = - A re U n>k ^( X ). 



keN,' 



It is shown in Appendix D that 



J2ffl£(x) 



where 



Q(x;cr,0) 



K 



-ff(x)/2 



K 



Af k| P k e (x)+Q(x ;( T,0)P k (x) 



(25) 



K 



Y XiO$(x) + Y 0i<Ti(x) + Y X i a i,i ~ (1 + |0|M X ) " 



i=l 



i=l 



i=l 



with <Tj(x) and <r(x) defined as in (2) and (3), respectively. Thus, one arrives at the following 
equation: 

Y A nUn , k P k e (x) = Y ^n,k[Af k |P k (x) + Q(x ;< T,0)P k (x)]. (26) 



We solve the equation by first representing Q(x; <x, #)P k (x) as a finite linear combination of 
{P k (x)} kgN A-i . Observe that Q is in fact a fourth-order polynomial in x. Collecting terms, 
Q can be written in the form 

Q(x;<r,0) = Y<l(i)xh (27) 
iex 

for the constants q(i) given in Appendix C. Applying Corollary 4 recursively, we obtain 

g(x;<r,0)j£(x) = £ ? (i) Y [tfMftt- 

Finally, substituting this equation into (26), multiplying both sides of (26) by P^x), and integrat- 
ing with respect to IIo(x) over the simplex Ak—i yields the matrix equation (24). □ 
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5 Empirical Results and Applications 



In this section, we study the convergence behavior of the eigenvalues and eigenvectors as we approx- 
imate the solutions of (24). Further, we show how the spectral representation can be employed to 
obtain the transient and stationary density explicitly (especially the normalizing constant), and to 
characterize the convergence rate of the diffusion to stationarity. A Mathematica implementation 
of the relevant formulas for computing the spectral representation is available from the authors 
upon request. 



5.1 Convergence of the eigenvalues and eigenvectors 

In what follows we order the Jacobi polynomials according to the graded lexicographic ordering of 
their corresponding indices. Thus < P^ 2 if 

• ni | < |ri2 1 , or 

• ni| = |n2 1 and ni is lexicographically smaller than U2- 

Fix K and note that, for a given truncation level D G No and I 6 No, there are i^^S^) polynomials 
with |n| = /, and U(D) := ( D ~j < K i 1 1 ) polynomials with index |n| < D. For the computations in 

the rest of this section we chose K = 3, unless otherwise stated. 

Now, one can obtain a finite-dimensional linear system approximating (24) by truncation, that 

is, taking only those entries in M and u n whose associated index vectors satisfy |n| < D. More 

explicitly, with = ([M] k>1 ) G R^( D )^( D ) an d u|f ] = (u n , k ) € R U ( D \ where k, 1 G N^ -1 such 

that |k|, |1| < D, the solutions of 



should approximate the solutions of the infinite system A n and u n . The convergence patterns of 
An and u^j. as D increases are exemplified in Figure 1 for the parameters 

12 14 15 

i) K = 3,0 = (0.01,0.02,0.03), a = a x := | 14 11 13 | ; and 

15 13 

120 140 150 

ii) K = 3,0 = (0.01,0.02,0.03), a = <x 2 := | 140 110 130 

150 130 

Figure 2 displays the convergence behavior for the parameters 

iii) K = 3,0 = (10, 20, 30), a = err, and 



iv) K = 4,0 = (0.01, 0.02, 0.03, 0.04), cr a 



In all cases, A [ n ] and 

u n°L conver g e with increasing truncation level to empirical limits. The 
eigenvalues A^f' decrease towards the empirical limit, whereas the coefficients uj^jj. show oscillatory 
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behavior before ultimately stabilizing. The rate of convergence is faster for smaller selection inten- 
sity. Varying the mutation parameters does not influence convergence behavior significantly. As 
expected, Aq^ converges rapidly to zero in all cases, consistent with the fact that the diffusion has 

a stationary distribution. For a fixed n, A^ and its associated coefficients roughly converge 
at similar truncation levels. 

Figure 3 shows for D = 24 and < n < 35 under neutrality (cr = 0) and selection (<x = cr\ 
and cr = \cr2)- Upon inspection all of the eigenvalues displayed have converged properly. Under 
neutrality, the eigenvalues are functions of |n|, thus they are degenerate and cluster into groups. In 
the presence of selection, however, we empirically observe that all of the eigenvalues are distinct. 
For moderate selection intensity, the group structure is less prominent. In general, increasing the 
selection parameters evens out the group structure and shifts the entire spectrum upward. 

Computing the transition density function for large selection coefficients requires combining 
terms of substantially different orders of magnitude, because of the exponential weighting factors 
in the density (20) and in the expansion (22) . Therefore, the coefficients uj^j. have to be calculated 
with high precision to obtain accurate numerical results under strong selection. 

5.2 Transient and stationary densities 

The approximations to the eigenvalues and the eigenfunctions B n (via the eigenvectors u[f' 
and equation (23)) can be used in the spectral representation (6) to approximate the transition 
density function at arbitrary times t. Examples with a = <j\ for different times are given in 
Figure 4. At first, the density is concentrated around the initial frequencies x = (0.02,0.02,0.96), 
but as time increases, the frequencies of the first and second allele increases, since these have a 
higher relative fitness. Eventually, the transition density converges to the stationary distribution 
(similar to distribution at t = 2), where the bulk of the mass is concentrated at high frequencies 
for the first and second allele. 

The eigenvalues Ajf^ and coefficients u^j. can also be employed to approximate the constant 
that normalizes the stationary distribution II(x) to a proper probability distribution. Following 
the same line of argument as Song and Steinrucken (2012), the orthogonal relations enable us to 
circumvent the difficulty involved in directly evaluating a multivariate integral over the simplex 
Ak-i- First, note that since Jz? maps constant functions to zero, any constant function is an 
eigenfunction with associated eigenvalue Ao = 0, thus Bq(x) = Bo(y) = const. In (6), taking 
t — > oo, we get 
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Figure 1: Convergence of the truncated eigenvalues A n and coefficients of the eigenvectors u n 
as the truncation level D increases, for K = 3 with low mutation rates. Subfigures (a) and (b) 
show Aq D ^, and A^q for o 

75 (8 2)' u 75 (7 3)> an< ^ u 7^(6 4) ^ or a = a i an d CT = °"2> respectively. The mutation rates were set 
to 6 = (0.01, 0.02, 0.03) for all computations. 



<T\ and cr = <J2, respectively. Subfigures (c) and (d) show 
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Figure 2: Convergence of the truncated eigenvalues A n and coefficients of the eigenvectors u n 
as the truncation level D increases, for K = 3 with high mutation rates and for K = 4 with 



low mutation rates. Subfigures (a) and (c) show for n = 0, 75, 150, and ut, 5 , 
(8, 2), (7, 3), (6, 4), respectively, for mutation rates = (10, 20, 30) and selection coefficients cr = er\. 

4 is shown in subfigures (b) and (d) for with n = 0, 75, 150, 
(3, 2, 4), (5, 3, 4), (3, 5, 2), respectively; the mutation rates were set to = 
(0.01,0.02,0.03,0.04) and the selection coefficients to cr = cr 3 . 
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Figure 3: The first 36 eigenvalues of the different spectra for the selection parameters cr = 0, a\ 
and !<t 2 , respectively. The latter was chosen so that the ranges of the eigenvalues are comparable. 
The truncation level D was set to 24 and mutation rates 6 = (0.01,0.02.0.03) were used. 



Then for x = y = 0, by (23) we have 

n f tt( \j (Bo,B )n 

Cn = A„._, n(z><iz = -BSSF 

2 



^ {0) (£ m6N -^o, m P£(o)) < 28 ) 



since a(0) = a^,^ = and i?i /' ej+27Vj) (O) = (-l)" j rSp ' Here C m is the constant defined 
in (14). The purely algebraic form of the right hand side in equation (28) allows to compute an 
accurate approximation of the normalizing constant Cn by replacing the infinite sums by sums over 
all indices m such that |m| is less or equal then a given truncation level. This offers an attractive 
alternative to other computationally intensive methods (Donnelly et al., 2001; Genz and Joyce, 
2003; Buzbas and Joyce, 2009). 

Figure 5 shows two examples of stationary distributions for different selection coefficients. In 
Figure 5(a) the stationary density is concentrated in the interior of the simplex, since all homozy- 
gotes are less fit then the heterozygotes. This situation is referred to as heterozygote advantage, 
resulting in a balancing selection pattern, and the different alleles co-exist at stationarity. In 
Figure 5(b), allele number 1 is strongly favored by the given selection coefficients, and thus the 
stationary density is concentrated at high frequencies for this allele. 

We can also use (6) to investigate the rate of convergence of the diffusion process to the station- 
ary distribution. Denote the difference between the transition density and the stationary density 
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t=0.04 



t=0.2 




Figure 4: Approximation of the transition density function (6) for different times t 6 
{0.04,0.2,1.0,2.0}. Selection was governed by the matrix of coefficients a = a\ and x = 
(0.02,0.02,0.96) was used as initial condition. The truncation level was set to D = 40, whereas 
the summation in equation (23) ranged over all m such that < |m| < 36, and all eigenfunctions 
and eigenvalues with < n < 561 were included in equation (6). The mutation rates were set to 
6 = (0.01, 0.02.0.03). The plots only vary in y\ and 1/2, since 2/3 = 1 — yi — D2- 
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Figure 5: Two examples of the stationary distribution for different selection parameters. The 
mutation rates were set to 6 = (0.01,0.02.0.03) in both cases. Again, a truncation level of D = 40 
was used, the summation in equation (28) ranged over all m such that < |m| < 36. The plots 
only vary in y x and y 2 , since y 3 = 1 - y 1 - y 2 . 
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t 

Figure 6: Convergence of the transition density to stationarity as time evolves, for initial frequencies 
x = (0.02, 0.02, 0.96) T . Deviation from the stationary density is measured by \\d(t; x, • ) 1 1 l/n ' defined 
in (29). The mutation rates were chosen to be 6 = (0.01,0.02,0.03) and the selection parameters 
were a = O.leri, er = 0.5eri and er = O.leri, respectively. The truncation level was set to D = 40, 
and (29) was approximated by summing over < n < 561 and m, k such that < |m|, |k| < 36. 



by 



d(i;x,y) 



p(t;x,y)-— n(y) 



n=l 



-A n t 



n(y) 



£ n (x)£? n (y) 
(B n , B n )u 



We measure the magnitude of d(t; x, y) by the square of its 1? norm with respect to the weight 
function 1/Il(y), that is, 



00 R 

\\d(t;x,.)\\l /n := <d,d>i/n = E^"' /R 

(%, rS n )u 



n=l 



n=l 



-ff(x) 



~2A n t 



EkeN^- 1 "".k^k (x) 



(29) 



E 



zr 1 I/ 2 C e 



Again, the sums in this expression can be approximated by truncating at a given level. Figure 6 
shows ||d(t;x, Oil l/n as a function of time t, for a = cri, a = 0.5<ti, a = O.leri. The initial 
frequencies were x = (0.02,0.02,0.96). As expected, the distance to the stationary distribution 
decreases over time. Further, the rate of convergence is faster if the values in er get larger, which 
was observed by Song and Steinriicken (2012) too. We note that the spectral representation can 
also be readily employed to study convergence rates measured by other metrics such as the total 
variation distance or relative entropy. 
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6 Discussion 



In this paper, we have extended the method of Song and Steinriicken (2012) to obtain an explicit 
spectral representation of the transition density function for the multi-dimensional Wright-Fisher 
diffusion under a PIM model with general diploid selection and an arbitrary number of alleles. 
We have demonstrated the fidelity and fast convergence of the approximations. Further, as an 
example application of our work, we have computed the normalization constant of the stationary 
distribution and quantified the rate at which the transition density approaches this distribution. 

Efficient approximations of the eigensystem and the transition density function lead to a number 
of important applications. Combining the stationary distribution discussed in Section 5.2 with the 
recurrence relation shown in Lemma 3, one can calculate algebraically the probability of observing 
a given genetic configuration of individuals sampled from the stationary distribution of the non- 
neutral diffusion. This kind of algebraic approach would complement previous works (Evans et al., 
2007; Zivkovic and Stephan, 2011) on sample allele frequency spectra that involve solving ODEs 
satisfied by the moments of the diffusion. Further, the algebraic approach is potentially more 
efficient than computationally expensive Monte Carlo methods (Donnelly et al., 2001) and more 
generally applicable than methods relying on the selection coefficients being of a certain form 
(Genz and Joyce, 2003). Note that, by discretizing time and space, our representation of the 
transition density function can be used for approximate simulation of frequencies from stationarity 
as well as frequency trajectories, which can in turn be employed in the aforementioned Monte Carlo 
frameworks. 

The sampling probability can be applied, for example, to estimate evolutionary parameters 
via maximum likelihood or Bayesian inference frameworks. Furthermore, the notion of sampling 
probability can be combined with the spectral representation of the transition density function 
in a hidden Markov model framework as in Bollback et al. (2008), to calculate the probability 
of observing a series of configurations sampled at different times. The method developed in this 
paper would allow for such an analysis in a model with multiple alleles subject to recurrent parent- 
independent mutation and general diploid selection. 

An important, albeit very challenging, future direction is to extend our current approach to an- 
alyze the dynamics of multi-locus diffusions with recombination and selection. Such an extension 
would allow for the incorporation of additional data at closely linked loci, which has the poten- 
tial to significantly improve the inference of evolutionary parameters, especially the strength and 
localization of selection. We have only considered Wright-Fisher diffusions in a single panmictic 
population of a constant size. As achieved in the alternative approaches of Gutenkunst et al. (2009) 
and Lukic et al. (2011), mentioned in Introduction, it would be desirable to generalize our approach 
to incorporate subdivided populations exchanging migrants, with possibly fluctuating population 
sizes. Another possible extension is to relax the PIM assumption and consider a more general 
mutation model. 

We note that our present technique relies on the diffusion generator being symmetric. This 
symmetry does not hold in some of the scenarios mentioned above, making a direct application of 
the ideas developed here difficult. However, we believe that it is worthwhile investigating whether 
one could apply our approach to devise approximations to the transition density function that are 
sufficiently accurate for practical applications. 
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A Proofs of lemmas 

Proof of Lemma 2. For two indices n, m 6 N^ _1 , consider the integral 



P«(x)p£(x)Ii (x)dx = / P n e (^)^(On (x(O)|det( J Dx)(^)|^, (30) 



Ajc-i J [OA] 

where the right hand side can be obtained by the coordinate transformation introduced in Ap- 
pendix B and the multivariate integration through substitution rule. Using 

1=1 

the determinant of the Jacobian 

K-2 

idetp X )(oi = nc 1 -**)*-^, 

i=i 

see Baxter et al. (2007) [Equation B.l], and the transformed Jacobi polynomials (37), it can be 
shown that 



holds, with 



K-l 

(« Jl e i +2JV J ) 

•i 



c ° = n < 



In the case n = m this can be seen immediately. If n ^ m without loss of generality let 1 < I < K — 1 
be the largest i such that n/ < m/ and = for all k = I + 1, . . . , K — 1. Then JVj = 
(recall Nk-i = Mk-i = 0) and -Rm|' e!+2A/ ^(£z) is orthogonal to all polynomials of lesser degree 
with respect to the weight function £^'~ 1 (1 — ^) e>!+2A/ '~ 1 , and thus the l-th. factor and the whole 
product is zero. □ 

Proof of Lemma 3. We found it most convenient to derive a recurrence relation for 

XiJ^(x) (31) 

by projecting expression (31) onto the orthogonal basis {P^x)}, and investigate the respective 
coefficients. First, note that the coordinate transformation introduced in Appendix B yields Xi = 

&rwi-&),so 

*ii2(x) = &n(W;)i£(0- (32) 

j<i 



21 



Further, integrate expression (32) against the base function -P„(£) times the weight function Ilo to 
get the respective coefficient in the basis representation. Using the integration by substitution rule 
again, as in equation (30), this yields 



K-l 



C m 7 [0 ,i]A-i f£ ^ 

(%,e 3 -+2M,) / ^ l^J^ U-4j) a4. 

;— Cm. ■'0 



1 



(6»i,ei+2Mi) 
Cm; 



o — 1 Cm. •'0 



i-1 

X 

j=l c mi 



The first term on the right hand side yields zero, unless mj = nj for all j > i, thus Mj = iVj. 
In this case the term is equal to 1. Since m,- = nj for j > i, note that the second term on the right 
hand side is of the form 

Cml J 

with io a)/ g(^) = £ a_1 (l — £)^ _1 > a = an( i = + 2iVj. Here we applied the recurrence 
relation (10) to £Rn*'^ (£) and used the orthogonality of the Jacobi polynomials. The constants 
are given by 



Q{a,b) 



if n — m = 1 and n > 0, 
-, if n — m = and n > 0, 



(2ra+a+t>-l)(2n+a+ft-2) ' 

1 _ b 2 -a 2 -2(fe-a) 

2 2(2n+a+fe)(2n+a+b-2) : 

^ n+1 lw^" a ?" fc 7^ i if n — m = — 1 and n > 0. 

k (2n+a+o)(2n+a+o— 1) ' — 



This expression is non-zero for —1 < rtj — m; < 1. Furthermore, the form of the integral for 
j = i — 1 depends on this difference, or rather the difference between Nj and Mj. Depending on 
the difference Nj — Mj we have to consider the integrals 

l-r l\^\mtf +2 Xi)^ + 2{€)dt, (33) 



R^HoRtfHoa - o^ao^, (34) 



+ 1 : f 1 Ri a /\0R^f- 2) (0^aA0,dt (35) 



22 



with a = 9j and (3 = Qj + 2Nj. In expression (35) we have to assume f3 > 2, which is equivalent 
to Nj > 1. This holds true, because if Nj = 0, this case would not have to be considered. 

Applying relation (12) twice to the polynomial Rh*'^ (£) in equation (33) and using orthogonality 



yields 



ui.<x,P) x 



+ 



u(<*,P) x 

n rr> ■ Un 



_l rf (o>/3) x 



for some constants HjLm ■ 



Here if, 



0,-1 



(«,/?) 



0,-2 



H 



(a./3) 
1,-1 



the expression for j is non-zero for nrij = n 3 -, r 
applied to the term — £), together with orthogonality to get 



0. Thus, in the case Mj = Nj + 1, 
1, and rij — 2. Furthermore, relation (10) can be 



X . , _|_ r(a,0) X 
1 n j ,mj' J nj+l,mj T 1 nj,mj U nj,m j 



t 1 n i ,m i u nj-l,mj 



for given constants I^jn\ with /_^'o^ 



(«,/3) 



0,-1 



0. In the case M,- = iV,-, the expression is non- 



l,rij, and + 1. Finally, applying relation (12) to the term R 



zero for nij = rij 
expression (35) combined with orthogonality yields 



(0 m 



7K/ 3 ) x _i_ j(<*,P) x . , _i_ r(«>^) a . „ 



for given constants J^ffl . Again ji^'g'' = J^fi^d = = 0- Thus this expression is non-zero for 



mi 



nj,rij + 1, and + 2. The constants Hr£m,In,m, Jn,m are given by 



(n+a+b-l)(n+a+b) 
(2n+a+b-l)(2n+a+b) ' 
2a 
a+fe+2 ' 

2(n+a-l)(n+a+6-l) 
(2n+a+fe-2)(2n+a+b) ' 
(n+a-2)(n+a-l) 
k (2n+a+6-2) (2n+a+6-l) ' 



if m — n = and n > 0, 



if m — n 
if m — n 
if m — n 



-1 and n = 1 
-1 and n > 1, 
-2 and n > 1, 



j(a,b) 



j(a,b) 



1 

a+fe' 

(n+l)(n+a+6-l) 
(2n+a+b~l){2n+a+b) ' 

6 

a+6' 
b 2 +a(&+2) 



(a+fe)(a+fe+2) ' 
b 2 +2n(n+a-l)+b(2n+a-2) 

(2n+a+b-2)(2n+a+fe) : 
ab 

(a+b)(a+b+l) ' 

(n+a-l)(n+b-l) 

~ (2n+a+fe-2)(2n+a+b-l) ' 
(6-l)(&-2) 



(a+6-l)(a+6-2) ' 

(n+fe-2)(ra+fe-l) 
(2n+a+b-2)(2n+a+fe-l) ' 
2(n+l)(ra+6-l) 
(2n+a+b-2){2n+a+b) ' 
(n+l)(n+2) 
k. (2n+a+b-l)(2n+a+b) ' 



lira — n 
lira — n 



1 and n = 0, 
1 and n > 0, 



if m — n = and n = 0, 
if m — n = and n = 1, 
if m — n = and n > 1, 
if m — n = — 1 and n = 1 
if m — n = —1 and n > 1, 



if m 
if m 
if m 
if m 



n 
n 
n 
n 



and n = 0, 

and n > 0, 

1 and n > 0, 

2 and n > 0. 



Now considering all three possible values for Nj — Mj, and all possible implications for the 
difference rij — rrij, it can be shown that 1 < Nj-% — Mf_i < 1 has to hold as well. Using induction 
shows that 1 < Nj — Mj < 1 holds for all j < i. Thus for all j < i the same integrals (33), (34), 
and (35), with adjusted parameters a = 6j and /3 = Qj + 2Nj, have to be considered. 
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Combining these results shows that for fixed i and n the polynomials with a non-zero contribu- 
tion to the recurrence relation for XjP„(x) are exactly those with indices from the set 



Mi(n) := m6l 



rrij > Vj, Mj = Nj Vj > i, \M S - N, \ < 1 Vj < z} , 



defined in (16). Thus, 



x i^n ( x ) — ^ ^ r n,m-P m ( x ) ; 



where the coefficients rjf^ are given by 



_ ^(^,0,4-2^) TT 
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(%,e,+2jv 3 -) 

1 rij,nij j 


if ^ 




= o, 


,(^-,G j +2iV j ) 


if A^ 


-M, 


= +1. 



□ 

Proof of Lemma 5. Using the coordinate transformation introduced in Appendix B, and applying 
Jz?o> given in equation (38), to -P„(£) from equation (37) yields 



K-l 



K-l 



^(e^E n n n II <^ +2JVj) fe)(i-^ 
^(i-^S{< i ' e<+2JV<) te)(i-^} 



(36) 



Employing equation (8), one can show that the terms in the brackets on the right hand side of 
equation (36) reduce to 



0.-& Ni RV i > ei+2Nt H&(-N ir . 1 (N t 
and substitution yields 
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A^_ 1 (Ar i _ 1 -i + e 



i-l 



with Af n| = \ 



n|(|n| - 1 + |0|), since O = |0|, N = |n|, and N K - 



□ 
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B Change of coordinates 



Working with the multivariate Jacobi polynomials and the neutral diffusion, it is convenient, for 
some derivations, to transform the equations to a different coordinate system. This transformation 
maps the simplex A.k-1 to the K — 1-dimensional unit cube [0, It is implicitly used in 
Griffiths and Spano (2011, Section 3), but more explicitly introduced and used as a transformation 
in Baxter et al. (2007). The vector £(x) = (£i(x), . . . ,£x-i(jc)) is obtained from the vector of 
population frequencies x via the transformation 

6 



1 - Ej<i x i 



for 1 < i < K — 1. The inverse of this transformation is given by 

j<i 

for 1 < i < K — 1. The inverse relation can be derived by noting that 1 — Sj<i x j = Tlj<i(l — £j) 
holds. 

Definition 1 yields immediately that the multivariate Jacobi polynomials -Pn(x) take the form 

= = n<" ' +2JVi) ^)(i - (37) 

in the transformed coordinates. The neutral diffusion generator Jzfo in the transformed coordinate 
system is given by the following lemma. 

Lemma 7. Using variables in the new coordinate system, the backward generator of the diffusion 
under neutrality can be written as 

= \Y. kii^m + E ^o^Jm (38) 

i=l 

with 



6(1-6) 



n*<i(i-&) 

a i(£) = 



2iifc<i(i-efc) 



The proof of this lemma is paraphrased in Appendix B of Baxter et al. (2007). The transfor- 
mation diagonalizes the operator by removing all the mixed second order partial derivatives. 
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C Coefficients of the polynomial Q(x;cr,0) 

l K 

q = -^(yZ e j a K,j - \0\vk,k) when i = 0, 
1 ^ 



-2(1 + |0|K,jr + (1 + 2|0|)<7*,jr + ^.ij, 

1 

2< 

■(1 + \d\)(a iu i 2 + o-^k - 2(7 i2) i<r)), 



q(h,i2,h) = 2^ ail ' i3 ~~ a ii,K)(^ii,i 2 ~~ °n,i<") ~~ ( a i3,K - &K,K)(<?i 2 ,K ~ <?K,k) 
~^( a i2,h + a K,K - 2crj 3) A')((Ti li A' - &K,k)), 
q(h, 12, i3, u) =— -((cr^^ + aK,K - 2&i 2 ,K)(o-i :j ,i 4 + &K,K — 2&i 4 ,K))- 

D Derivation of equation (25) 

Applying Jzf to S*(x), 

JS?S£(x) = (jSf + J^)(^(x) e - s W/ 2 ) 

/4 <r(x) <^(x:) /4 

+ if(x)jSf ff e-— + e~— J^(x) 

i,j=l * 3 

It can be shown that the last two terms in the above expression sum up to 0. Note that for 
l<i,j<K-l, 

d K K 

— o-(x) = 2 <Tfc,iX fe - 2 g^zi, (39) 
x * fc=i i=i 

cr(x) = 2(<Tij - cr^A- - &i,K + cr^.iir)- (40) 



dxjdxi 
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It follows that 

A-l 



y=l 



e 2 



#-i ^ 



i=l 



K-l 



K-l 



2 1 dxt 

K 



2 ax 



i=i j=i 



i=i 



e 2 



K-i / A 



A' 



E ^ ( E ^M 2 * ~ E °*'.^ ) ft^T ( P n ( x ) } 



1=1 ^fc=l 

A-l / A 



i=l 



9 



A-l , A A v ^ 

+ E ( E " E -qt. { P n ( X ) } 

i,7'=l ^fc=l 1=1 ' i 



;3 

K-l 



i=l 



i=i 



1=1 



e 2 



A-l 







a 



E X ^( P n(x)}E a ^ 



i=l 



Z=l 



A-l ^ .A A-l A 

E Xj d^~.^ P ^ X ^\ E a l,KXKXl + E z » E °"l,A^l 
3=1 3 



1=1 



i=l 1=1 



where we used equation (39) for the second equality and Yli=i x i = ^ f° r the last equality. 
Further, using equation (39) and (40) one can show that 



<r(x) 



1 g(x] 

-e 2 
2 



<r(x) 



A 



A 



A 



XiO?(x) - E X i°H + (! + l IM X ) + ^( X ) 2 " E 



i=l 



i=l 



i=l 



= -e 2 Q(x;<r,0), 

where Q takes the form (27), that is Q(x;cr,6) = X^iez^(*) x i' w ith the constants q(i) given in 
Appendix C. 
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